Did COVID impact vessel operations?
An analysis of several vessel/port factors, including geospatial representation.

Project Group - 25¶

Members:

Yun-An LIN (5841682)

Rohan Menezes (5850908)

John Kuttikat (5765382)

Muhammad Rizki Ziarieputra (5848113)

Ian Trout (5851483)

Contribution Statement¶

Yun-An LIN: Data for CPPI (both data frames and visualization)

Rohan Menezes: Streamlit Data Visualisation, Data filtering

John Kuttikat: Data manipulation and filtering, compilation of code, covid data

Muhammad Rizki Ziarieputra: Data gathering and analyzing port & covid data.

Ian Trout: Data research, Peak/Valley calculations, world data visualisation, narrative

Data Used¶

  1. Covid data (https://data.humdata.org/dataset/coronavirus-covid-19-cases-and-deaths)

  2. Port data (https://unctadstat.unctad.org/wds/TableViewer/tableView.aspx?ReportId=170027)

  3. Port Calls data (https://unctadstat.unctad.org/wds/TableViewer/tableView.aspx?ReportId=194890)

  4. The Container Port Performance Index (CPPI) from World Bank Group (https://thedocs.worldbank.org/en/doc/66e3aa5c3be4647addd01845ce353992-0190062022/original/Container-Port-Performance-Index-2021.pdf)

INTRODUCTION¶

COVID affected many areas of the economy around the world. As cases rose, cities were forced into lockdowns that disrupted day-to-day activities, and large parts of the economy and its supply chains came to a standstill. In addition, there was a tremendous increase in online shopping as people shifted to living and working at home (if they could). This research study attempts to find correlations between COVID and marine logistics activities. It does so by concentrating on different performance parameters and comparing different countries, in order to note any changes in performance data that may be attributable to COVID. The following sub-questions are addressed in this report:

  1. Global impact of COVID on vessel waiting time and number of port calls.

  2. Variation in impact of COVID across the globe (country vs country comparison).

  3. Comparison of COVID cases to average country-wide port index (intermodal container ships only).

  4. Impact of COVID on other vessel factors (average size and average age).

These sub-questions are answered using the data collected for COVID cases, port data parameters, and performance data over the past 2 years. Where it is available, the report also compares pre-COVID port data with port data recorded after the onset of COVID (July 2018 to July 2022).

1. Global impact of COVID on vessel waiting time and number of port calls¶

In [ ]:
# Running data import and manipulation file from 'TILProgramming-secondary'
%run ./TILProgramming-secondary.ipynb

First, this report takes a global view of all ship types and the median time in port over the years (before and during COVID). The median time in port is the time (in days) a vessel stays in a port to carry out all its activities, including loading and unloading its cargo. The median was chosen over the mean because the distribution of time spent in port has a ‘long tail’: outlier ships spend weeks or months in a port, for example for repairs. As Figure 1.1 shows, the data obtained by the project team does not cover every country in the world, but it does cover more than 15 countries.
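To illustrate why the median is the more robust choice for long-tailed data, the following minimal sketch uses made-up port stays in which one vessel is laid up for repairs. The numbers and variable names are assumptions chosen purely for illustration and are not taken from the UNCTAD data.

```python
from statistics import mean, median

# Hypothetical port stays in days; the last vessel stays several weeks for repairs.
port_stays_days = [0.8, 0.9, 1.0, 1.1, 1.2, 45.0]

mean_stay = mean(port_stays_days)      # pulled upwards by the single outlier
median_stay = median(port_stays_days)  # stays close to a typical port stay

print(f"mean:   {mean_stay:.2f} days")
print(f"median: {median_stay:.2f} days")
```

A single long stay moves the mean to more than eight days, while the median remains close to one day, which is why the median better represents a typical port call.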

In [ ]:
fig = px.choropleth(geo_port_all, locations="ISO_A3",
                    title = "Figure 1.1: Map timeline of worldwide median time in port by country",
                    color = "median_time_in_port", 
                    hover_name = "country",
                    range_color = (0, 2),
                    animation_frame = "date",
                    color_continuous_scale = px.colors.sequential.Plasma
                )
fig.show()

Figure 1.1 shows that Australia and the USA had the two highest median times in port, and that this has been fairly consistent since 2018. Next, this analysis compares the changes in COVID cases with the median time in port to see whether the two are connected. Figure 1.2 below shows the median time in port alongside new COVID cases, where new COVID cases is the number of people who contracted COVID during each six-month period.

In [ ]:
df_combined_world = port_covid_world[port_covid_world['vessel_type'] == 'All ships']

# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

# Add traces
fig.add_trace(go.Scatter(x=df_combined_world['date'], y=df_combined_world['median_time_in_port'], name="median_time_in_port"), secondary_y=False)
fig.add_trace(go.Scatter(x=df_combined_world['date'], y=df_combined_world['new_cases'], name="New covid cases"), secondary_y=True) 

# Add figure title
fig.update_layout(title_text="Figure 1.2: Global new COVID cases and median time in port from 2018 to 2022")

# Set x-axis title
fig.update_xaxes(title_text="Date")

# Set y-axes titles
fig.update_yaxes(title_text="<b>primary</b> Median time in port (days)", secondary_y=False)
fig.update_yaxes(title_text="<b>secondary</b> New covid cases", secondary_y=True)

fig.show()

Before 2020, the 'median time in port' is relatively stable at 0.97 days, which reflects ports operating smoothly with a full set of employees and a steady schedule. After the COVID pandemic started, the ‘median time in port’ increased significantly (up to 1.07 days in July 2022, roughly a 10% increase over the pre-2020 level), which suggests that COVID did influence port operations.
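The 10% figure follows directly from the two values quoted above. A minimal sketch of the arithmetic, where the helper name `percent_change` is introduced here for illustration only:

```python
# Values quoted in the text above (days).
baseline_days = 0.97  # pre-2020 median time in port
latest_days = 1.07    # July 2022 median time in port


def percent_change(old, new):
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100.0


increase_pct = percent_change(baseline_days, latest_days)
print(f"{increase_pct:.1f}% increase")
```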

From the trend in the above figure, the most likely reason for the increased 'median time in port' is a shortage of workers caused by lockdowns and isolation. As confirmed cases increased, work-from-home policies and limits on the number of people working in person drastically reduced the personnel available for efficient port operations.

In [ ]:
df_combined_world_calls = portcalls_covid_world[portcalls_covid_world['vessel_type'] == 'All ships']

# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])

# Add traces
fig.add_trace(go.Scatter(x=df_combined_world_calls['date'], y=df_combined_world_calls['num_port_calls'], name="Num_port_calls"), secondary_y=False)
fig.add_trace(go.Scatter(x=df_combined_world_calls['date'], y=df_combined_world_calls['new_cases'], name="New covid cases"), secondary_y=True)

# Add figure title
fig.update_layout(title_text="Figure 1.3: Global number of port calls and new COVID cases by semester")

# Set x-axis title
fig.update_xaxes(title_text="Date")

# Set y-axes titles
fig.update_yaxes(title_text="<b>primary</b> Number of port calls", secondary_y=False)
fig.update_yaxes(title_text="<b>secondary</b> New covid cases", secondary_y=True)

fig.show()

Figure 1.2 shows that the median time in port increased, pointing to inefficiencies in the operation of ports, but it gives no visibility on the actual number of vessels coming in. In other words, a key parameter is still missing to understand how port performance was affected by berth demand (number of port calls) during the COVID period. Figure 1.3 shows this graphically. The number of port calls climbed gradually, with a small reduction in 2019, until it peaked at 2.27M port calls at the beginning of 2020.

In the first semester of 2020, the number of port calls collapsed (a decrease of 400,000 calls). This is attributed to lockdowns around the world, which nearly halted the entire supply chain and left workers and other resources unavailable for transhipment operations.

However, after the sudden decrease in port calls in July 2020, the number of vessels coming in started to increase again. This was the restarting phase of the global shipping market. What Figures 1.2 and 1.3 cannot show is that worker availability returned much more slowly than demand for shipping (number of port calls). Contributing factors include COVID deaths, early retirement, and government stipends that enabled port workers to remain unemployed or seek other professions. Comparing Figure 1.2 and Figure 1.3, we conclude that the increased median time in port is due to the recovery in port calls combined with limited manpower in ports (January 2022 had the same number of port calls as before COVID, but a ‘median time in port’ of 1.05 days).

Another contributing factor might be the rapidly increasing demand for online shopping. During lockdowns, people could not go shopping or do other activities as frequently as usual, so they tended to shop online, which puts more demand on logistics and results in more port calls.

This section discussed, by means of graphics, how the COVID pandemic affected the time ships spent in port globally. However, one cannot simply assume from the analysis above that COVID affected port metrics equally in each country. The next section explores this in more detail with a country-by-country analysis.

2. Variation in impact of COVID across the world (country vs country comparison)¶

The prior section showed the impact of COVID on a global scale. This section zooms in to a more granular scale (i.e. individual countries across various regions). Before analyzing every country, the report looks at major port economies in two distinct parts of the globe, the west and the east: the USA for the west and Japan for the east.

From Figure 2.1, one can note that COVID had an impact on the ‘median time in port’. Japan shows a small increase of around 0.1 days, whilst the ‘median time in port’ in the USA rose from 1.42 to 1.7 days. The COVID case counts also differ greatly between the two countries, with the USA having over 1.2M new cases in January 2022 versus Japan's 0.1M.

Figure 2.1: Median time in port and new cases vs time

To understand how the impact of COVID differs and how it correlates with the ‘median time in port’, the project team used Spearman’s correlation coefficient, which assesses how well the relationship between two variables can be described using a monotonic function.

Two values are presented: Spearman’s coefficient and the P-value. The coefficient lies between -1 and 1, ranging from a full negative correlation to a full positive correlation. A value of 0 means no correlation.

The P-value is the probability of finding a result at least as extreme as the current one if the correlation coefficient were in fact zero (the null hypothesis). If this probability is lower than the conventional 5% (P<0.05), the correlation coefficient is called statistically significant.
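The coefficient itself is simply the Pearson correlation of the ranks of the two series. The sketch below shows this with plain Python on made-up semester values; the function names and the sample numbers are assumptions for illustration, and the notebook's own calculation (in the secondary notebook) is not shown here and may differ. In practice, `scipy.stats.spearmanr` returns both the coefficient and the P-value in one call.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's coefficient: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical semester values, not taken from the report's data.
new_cases = [0, 120, 900, 400, 1500, 1100]
median_time_in_port = [1.42, 1.40, 1.55, 1.50, 1.70, 1.65]

rho = spearman_rho(new_cases, median_time_in_port)
print(f"Spearman coefficient: {rho:.3f}")
```

Because only the ranks matter, any monotonic relationship (linear or not) yields a coefficient of 1 or -1. The sketch does not handle a series in which all values are identical, where the coefficient is undefined.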

We hypothesize that different locations experienced a different impact from COVID on the median vessel time in port. We also hypothesize that eastern countries were impacted more than western countries because COVID started in the east.

Correlation coefficient for US, Japan and China

In [ ]:
spearman_covid_port = {'country':country_list_1, 'correlation_cf':correlation_c1, 'p_value':p_value_1}
spearman_covid_port = pd.DataFrame(spearman_covid_port)
spearman_covid_port
Out[ ]:
   country                   correlation_cf  p_value
0  United States of America  0.542857        0.265703
1  Japan                     0.771429        0.072397
2  China                     0.428571        0.396501

From the US and Japan data alone, our hypothesis appears to hold, since Japan’s correlation coefficient (0.77) is greater than that of the USA (0.54), indicating a stronger association between new cases and median time in port. However, the coefficient for China (0.43) is lower than for the USA, meaning the association in China was weaker than in the USA. Therefore our initial assumption that COVID had a larger impact on the eastern part of the globe than on the west is not supported. Note also that none of the three P-values is below 0.05, so these coefficients are not statistically significant and should be read as indicative only.

To round off this analysis, we applied the same calculation to all the countries available in the dataset and plotted the resulting P-values on a map. In Figure 2.2 below, one can note that the impact of COVID on individual countries occurred irrespective of geographical location (North American, European, Oceanic, or SE Asian regions). This may signify that country-specific COVID mitigation measures had a stronger impact on port metrics than region-wide measures (e.g. Norway has a P-value of nearly 1 whereas Sweden has a P-value of almost 0).

In [ ]:
spearman_covid_port = {'country':country_list_2, 'correlation_cf':correlation_c2, 'p_value':p_value_2}
spearman_covid_port = pd.DataFrame(spearman_covid_port)
In [ ]:
geo_spearman = pd.merge(df_geo, spearman_covid_port, on = 'country')

# World map of the P-values, with the correlation coefficient shown on hover

fig = px.choropleth(geo_spearman, locations="ISO_A3",
                    title = "Figure 2.2: Variation in P-values and correlation coefficients across the globe",
                    color = "p_value", 
                    hover_name = "country",
                    range_color = (0, 1),
                    hover_data = ['correlation_cf'], 
                    color_continuous_scale = px.colors.sequential.Plasma)
fig.show()

To get another perspective, the project team analyzed another source of country-level data: container ship port indices. This is explained further in the next section.

3. Comparison of COVID cases to average country-wide port index (intermodal container ships only)¶

In the previous section, it was found that the severity of the impact of COVID varies across countries irrespective of geographic location. This section looks into other factors that might have changed the influence of COVID on ports, in particular whether the severity of the change in ‘median time in port’ is related to the average port ranking of each country, i.e. whether countries with better-ranked ports were less affected.

The port index data pertains to container ships only and was created by The World Bank. Therefore we must say: this is an adaptation of an original work by The World Bank. Views and opinions expressed in the adaptation are the sole responsibility of the author or authors of the adaptation and are not endorsed by The World Bank.

Definition of port index: For the purposes of our research objective, the port index is the result of a factor analysis (fA) of a large data set. fA ascertains the impact of a series of measured variables (average time spent at berth, port-to-berth, and total port hours) on an unseen latent variable (in this case, efficiency), which cannot be measured directly with a single variable.

The process of fA is as follows: the latent variable is seen via the relationships it has with a series of visible and measurable variables, each of which contains information about the “efficiency” of the port. The latent variable, efficiency, is a function of each of the measured variables and an error term for each. fA then determines the relative weight to be attached to each of the measured variables vis-à-vis the efficiency of the port, together with some uncertainty, which is captured by the error terms. For the remainder of this report, the fA result will be referred to as the ‘statistic average rank’. The closer the ‘statistic average rank’ is to 1, the better the port performs.
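In symbols, a textbook single-factor model expresses this idea as follows. This is a generic formulation given here as an assumption for illustration; the exact specification used for the CPPI is described in the World Bank report, and the symbols below do not come from it.

```latex
x_i = \lambda_i f + \varepsilon_i, \qquad i = 1, \dots, k
```

Here $x_i$ is the $i$-th measured variable (for example total port hours), $f$ is the unobserved efficiency of the port, $\lambda_i$ is the weight (loading) that links the two, and $\varepsilon_i$ is the error term that captures the remaining uncertainty.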

In [ ]:
# Separate by the year
gb_year = port_performance.groupby("year")
per_2020 = gb_year.get_group(2020)
per_2021 = gb_year.get_group(2021)

# Figure
fig = go.Figure()
fig.add_trace(go.Bar(
    x=per_2020["country"],
    y=per_2020["statistic_approach_rank"],
    name='2020', marker_color='skyblue'
    
))
fig.add_trace(go.Bar(
    x=per_2021["country"],
    y=per_2021["statistic_approach_rank"],
    name='2021', marker_color='lightsalmon'
    
))

# Set the figure title
fig.update_layout(
    title_text="Figure 3.1: Statistic average rank of each country"
)

# Set x-axis title
fig.update_xaxes(title_text="Country")
fig.update_xaxes(categoryorder='total ascending')
# Set y-axes titles
fig.update_yaxes(title_text="Statistic Average Rank")

fig.update_layout(barmode='group', xaxis_tickangle=-45)
fig.show()

Figure 3.1 above shows the statistic average rank of each country in 2020 and 2021. A lower rank means higher performance. In 2020, Croatia had the lowest average rank and France the highest. In 2021, Japan and the United Kingdom had the lowest and highest ranks respectively.

In [ ]:
fig = go.Figure()
fig.add_trace(go.Bar(
    x=per_2020["country"],
    y=per_2020["median_time_in_port"],
    name='2020',
    marker_color='SkyBlue'
))
fig.add_trace(go.Bar(
    x=per_2021["country"],
    y=per_2021["median_time_in_port"],
    name='2021',
    marker_color='lightsalmon'
))

# Set the figure title
fig.update_layout(
    title_text="Figure 3.2: Average median time in port for each country"
)

# Set x-axis title
fig.update_xaxes(title_text="Country")
fig.update_xaxes(categoryorder='total ascending')
# Set y-axes titles
fig.update_yaxes(title_text="Median Time in Port (days)")
fig.show()

Figure 3.2 above shows the ‘median time in port’ for each country in 2020 and 2021. For most countries, the median time in port increased in 2021, which may be due to COVID.

In [ ]:
# Create figure with secondary y-axis
fig = make_subplots(specs=[[{"secondary_y": True}]])
# Sort a copy so that the x and y values of each trace come from the same, identically ordered rows
perf_sorted = port_performance.sort_values("median_time_in_port")
# Add traces
fig.add_trace(go.Bar(x=perf_sorted['country'], y=perf_sorted["median_time_in_port"], name="Median Time in Port"), secondary_y=False)
fig.add_trace(go.Scatter(x=perf_sorted['country'], y=perf_sorted["index_value"], name="Index Values", mode = "markers", 
                marker=dict(size=10)), secondary_y=True)

# Add figure title
fig.update_layout(title_text="Figure 3.3: The Comparison of Average Median Time in Port and Index Values for each Country")

# Set x-axis title
fig.update_xaxes(title_text="Country")

# Set y-axes titles
fig.update_yaxes(title_text="<b>primary</b> Median Time in Port", secondary_y=False)
fig.update_yaxes(title_text="<b>secondary</b> Index Values", secondary_y=True)

fig.show()

Figure 3.3 combines Figures 3.1 and 3.2, showing how the median time in port varies with the port index. Overall, this section and Section 2 complement each other: the better the port performance index (e.g. Japan), the less vulnerable the country's port operations were to COVID (Figure 2.1 from Section 2). The worse the port performance index (e.g. USA), the more vulnerable the country was and the more COVID disrupted the logistics supply chain (starting with ports, since that is the data analyzed in this report). To explore this further, the following section analyzes different port metrics within the USA.

4. Impact of COVID on other vessel factors (average size and average age)¶

In this section, the report analyzes one country over the period covered by the COVID data to see whether there are links between two factors already discussed (median time in port and number of port calls) and two new factors: the average age and the average size of vessels. The United States of America was picked because of its large baseline time in port prior to COVID and its good-quality COVID reporting procedures (certain countries did not accurately report their COVID case numbers due to issues with testing, hospitals, etc.). After calculating the peaks and valleys of the five data sets, we compare them to see whether they share common dates, either exactly or within a certain margin (this report chose 1 semester on either side). More details on these calculations are in the data manipulation Jupyter notebook.
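The helpers `data_highs` and `data_lows` used in the cells below are defined in the data manipulation notebook and are not reproduced here. As a rough idea of what such a calculation can look like, the following sketch finds strict local peaks and valleys in a semester series and then checks which valleys of one series have a partner in another series within one semester. The function names `local_extrema` and `common_within` and the sample values are hypothetical; the project's own implementation may differ.

```python
def local_extrema(values):
    """Return (peak_indices, valley_indices) of strict interior local extrema."""
    peaks, valleys = [], []
    for i in range(1, len(values) - 1):
        if values[i] > values[i - 1] and values[i] > values[i + 1]:
            peaks.append(i)
        elif values[i] < values[i - 1] and values[i] < values[i + 1]:
            valleys.append(i)
    return peaks, valleys


def common_within(indices_a, indices_b, margin=1):
    """Indices in indices_a with a partner in indices_b at most `margin` steps away."""
    return [a for a in indices_a if any(abs(a - b) <= margin for b in indices_b)]


# Hypothetical semester series (one value per six-month period).
port_calls = [100, 104, 90, 96, 101, 99]
time_in_port = [1.42, 1.45, 1.50, 1.47, 1.60, 1.70]

calls_peaks, calls_valleys = local_extrema(port_calls)
time_peaks, time_valleys = local_extrema(time_in_port)

print("port call valleys:", calls_valleys)
print("time-in-port valleys:", time_valleys)
print("valleys within one semester:", common_within(calls_valleys, time_valleys))
```

Since each index corresponds to one semester, a margin of 1 matches the choice of one semester on either side. The first and last points are never reported as extrema because they have only one neighbour.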

In [ ]:
# Then I start the figure and create several dictionaries that are necessary. The peaks and valleys dictionaries are for the graphs and the date dictionaries are for the next steps
fig_1 = go.Figure()
peaks_dict_1 = {}
valleys_dict_1 = {}
peaks_date_dict_1 = {}
valleys_date_dict_1 = {}

# I find the peaks and valleys and add them to the dictionaries
for activity in activities_story_1:
    max_ind = data_highs(geo_port_all_vessels, activity)
    peaks_dict_1[activity]=max_ind

    min_ind = data_lows(geo_port_all_vessels,activity)
    valleys_dict_1[activity]=min_ind
    
    # Then I turn them into dataframes to be able to use the dates for the graphs, and for the date dictionaries
    df_max_1 = geo_port_all_vessels.iloc[max_ind]
    df_min_1 = geo_port_all_vessels.iloc[min_ind]

    # The date dictionaries are filled with the dates of the peaks and the valleys
    peaks_date_dict_1[activity] = df_max_1['date']
    valleys_date_dict_1[activity] = df_min_1['date']

    # Now the figure display is made
    x1 = geo_port_all_vessels['date']
    y1 = geo_port_all_vessels[activity]
    x2 = df_max_1['date']
    y2 = df_max_1[activity]
    x3 = df_min_1['date']
    y3 = df_min_1[activity]
    fig_1.add_trace(go.Scatter(x=x1,y=y1,name=activity))
    fig_1.add_trace(go.Scatter(x=x2,y=y2,mode='markers',name='peaks ' + activity))
    fig_1.add_trace(go.Scatter(x=x3,y=y3,mode='markers',name='valleys ' + activity))

# Set x-axis title
fig_1.update_xaxes(title_text="Date")


fig_1.update_layout(title='Figure 4.1: Trends of vessel port calls and average size through the years for USA')

fig_1.show()

From Figure 4.1 above, one can note that the average size of vessels has stayed relatively constant (between 16.3k GT and 17.6k GT). This holds true even when looking at pre-COVID semesters (shown in the Streamlit visualization). Meanwhile, there is a notable decrease in port calls when the COVID pandemic starts in the 1st semester of 2020.

The other data available is the average age of vessels (in years), plotted for the years of the COVID pandemic in Figure 4.2 below. The average age of vessels increases slightly over time, by roughly 1.5 years of age over 1 year. This signifies a slow aging of the vessels arriving in the USA, but there is nothing significant about this, as old ships are regularly scrapped while new vessels are constructed and brought into service (roughly 900 vessels per year).

In [ ]:
fig_2 = go.Figure()
peaks_dict_2 = {}
valleys_dict_2 = {}
peaks_date_dict_2 = {}
valleys_date_dict_2 = {}

# I find the peaks and valleys and add them to the dictionaries
for activity in activities_story_2:
    max_ind = data_highs(geo_port_all_vessels, activity)
    peaks_dict_2[activity]=max_ind

    min_ind = data_lows(geo_port_all_vessels, activity)
    valleys_dict_2[activity]=min_ind
    
    # Then I turn them into dataframes to be able to use the dates for the graphs, and for the date dictionaries
    df_max_2 = geo_port_all_vessels.iloc[max_ind]
    df_min_2 = geo_port_all_vessels.iloc[min_ind]

    # The date dictionaries are filled with the dates of the peaks and the valleys
    peaks_date_dict_2[activity] = df_max_2['date']
    valleys_date_dict_2[activity] = df_min_2['date']

    # Now the figure display is made
    x1 = geo_port_all_vessels['date']
    y1 = geo_port_all_vessels[activity]
    x2 = df_max_2['date']
    y2 = df_max_2[activity]
    x3 = df_min_2['date']
    y3 = df_min_2[activity]
    fig_2.add_trace(go.Scatter(x=x1,y=y1,name=activity))
    fig_2.add_trace(go.Scatter(x=x2,y=y2,mode='markers',name='peaks ' + activity))
    fig_2.add_trace(go.Scatter(x=x3,y=y3,mode='markers',name='valleys ' + activity))

# Set x-axis title
fig_2.update_xaxes(title_text="Date")

fig_2.update_layout(title='Figure 4.2: Median time ships are in port & average age of vessels through the years for USA')

fig_2.show()

Figure 4.3 below is a zoomed-in version of Part 1, Figure 2; focusing on just the period during COVID in order to see peaks and valleys in the data.

In [ ]:
fig_3 = go.Figure()
peaks_dict_4 = {}
valleys_dict_4 = {}
peaks_date_dict_4 = {}
valleys_date_dict_4 = {}

# I find the peaks and valleys and add them to the dictionaries
for activity in activities_story_4:
    max_ind = data_highs(geo_port_all_vessels, activity)
    peaks_dict_4[activity]=max_ind

    min_ind = data_lows(geo_port_all_vessels, activity)
    valleys_dict_4[activity]=min_ind
    
    # Then I turn them into dataframes to be able to use the dates for the graphs, and for the date dictionaries
    df_max_4 = geo_port_all_vessels.iloc[max_ind]
    df_min_4 = geo_port_all_vessels.iloc[min_ind]

    # The date dictionaries are filled with the dates of the peaks and the valleys
    peaks_date_dict_4[activity] = df_max_4['date']
    valleys_date_dict_4[activity] = df_min_4['date']

    # Now the figure display is made
    x1 = geo_port_all_vessels['date']
    y1 = geo_port_all_vessels[activity]
    x2 = df_max_4['date']
    y2 = df_max_4[activity]
    x3 = df_min_4['date']
    y3 = df_min_4[activity]
    fig_3.add_trace(go.Scatter(x=x1,y=y1,name=activity))
    fig_3.add_trace(go.Scatter(x=x2,y=y2,mode='markers',name='peaks ' + activity))
    fig_3.add_trace(go.Scatter(x=x3,y=y3,mode='markers',name='valleys ' + activity))

# Set x-axis title
fig_3.update_xaxes(title_text="Date")

fig_3.update_layout(title='Figure 4.3: Median time ships are in port through the years for USA')

fig_3.show()

From Figure 4.3 above, one can note that the median time in port in the USA has been increasing consistently, except for a small decrease between July 2020 and January 2021. This occurred after the initial wave of COVID hit the USA and restrictions loosened (restrictions were put back in place around December 2020/January 2021).

In [ ]:
# Now we're going to compare the two date dictionaries and see if there are any common valleys (there are no peaks for the average size data set).
common_valleys_2 = {}
common_valleys_2['in same semester'] = []
common_valleys_2['in months after'] = []
common_valleys_2['in months before'] = []

for date1 in valleys_date_dict_1[activity_4]:
    for date2 in valleys_date_dict_2[activity_2]:
        if date1.date() == date2.date():
            common_valleys_2['in same semester'].append(str(date1.date()))
        if date1.date() + relativedelta(months=+6) == date2.date():
            common_valleys_2['in months after'].append(str(date1.date()))
        if date1.date() + relativedelta(months=-6) == date2.date():
            common_valleys_2['in months before'].append(str(date1.date()))
                               
print('Low', activity_4, 'for which', activity_2, 'also has a valley in the same semester in the USA:', common_valleys_2['in same semester'])
print('Low', activity_4, 'for which', activity_2, 'also has a valley 6 months prior in the USA:', common_valleys_2['in months before'])
print('Low', activity_4, 'for which', activity_2, 'also has a valley 6 months after in the USA:', common_valleys_2['in months after'])
Low avg_size_of_vessel for which median_time_in_port also has a valley in the same semester in the USA: ['2021-01-31']
Low avg_size_of_vessel for which median_time_in_port also has a valley 6 months prior in the USA: []
Low avg_size_of_vessel for which median_time_in_port also has a valley 6 months after in the USA: []

The calculations above look for links between 2 of the port factors analyzed. The shared valley in January 2021 suggests that median time in port is linked to the size of the vessel. Shipping companies cannot easily switch vessel sizes, since most ships are in use at all times, vessel size is determined during construction, and vessels remain in service for three to five decades. It is therefore unlikely that COVID cases had any impact on shipping companies' choice of vessel size. The calculations below instead relate new COVID cases to the port parameters of port calls and median time in port.

In [ ]:
# The same as for the previous questions, but now with valleys for covid and peaks for # of port calls and median time in port
common_peaks_2 = {}
common_peaks_2['in same semester'] = []
common_peaks_2['in months after'] = []

covid_valleys = [d.date() for d in valleys_date_dict_3[activity_1]]  # COVID cases
port_call_peaks = [d.date() for d in peaks_date_dict_1[activity_3]]
time_in_port_peaks = [d.date() for d in peaks_date_dict_2[activity_2]]

# Compare each pair of series once, so that a match is not appended again on every pass of an unrelated loop
for first, second in [(covid_valleys, port_call_peaks),
                      (covid_valleys, time_in_port_peaks),
                      (port_call_peaks, time_in_port_peaks)]:
    for d1 in first:
        for d2 in second:
            if d1 == d2:
                common_peaks_2['in same semester'].append(str(d1))
            if d1 + relativedelta(months=+6) == d2:
                common_peaks_2['in months after'].append(str(d1))
                               
print('Low COVID', activity_1, 'for which', activity_3, 'or', activity_2, 'also has a peak in the same semester in the USA:', common_peaks_2['in same semester'])
print('Low COVID', activity_1, 'for which', activity_3, 'or', activity_2, 'also has a peak 6 months after in the USA:', common_peaks_2['in months after'])
Low COVID new_cases for which num_port_calls or median_time_in_port also has a peak in the same semester in the USA: []
Low COVID new_cases for which num_port_calls or median_time_in_port also has a peak 6 months after in the USA: ['2021-07-31']
In [ ]:
# Three different graphs as the units are different, but in a subplot
fig_4 = make_subplots(rows=3,cols=1)

x1 = geo_port_all_vessels['date']
y1 = geo_port_all_vessels[activity_1]
x2 = df_max_3['date']
y2 = df_max_3[activity_1] 
x3 = df_min_3['date']
y3 = df_min_3[activity_1]
x4 = geo_port_all_vessels['date']
y4 = geo_port_all_vessels[activity_2]
x5 = df_max_2['date']
y5 = df_max_2[activity_2]
x6 = df_min_2['date']
y6 = df_min_2[activity_2]
x7 = geo_port_all_vessels['date']
y7 = geo_port_all_vessels[activity_3]
x8 = df_max_1['date']
y8 = df_max_1[activity_3]
x9 = df_min_1['date']
y9 = df_min_1[activity_3]

# add_trace with row/col replaces the deprecated append_trace
fig_4.add_trace(go.Scatter(x=x1,y=y1,name=activity_1),row=1,col=1)
fig_4.add_trace(go.Scatter(x=x2,y=y2,mode='markers',name='peaks ' + activity_1),row=1,col=1)
fig_4.add_trace(go.Scatter(x=x3,y=y3,mode='markers',name='valleys ' + activity_1),row=1,col=1)
fig_4.add_trace(go.Scatter(x=x4,y=y4,name=activity_2),row=2,col=1)
fig_4.add_trace(go.Scatter(x=x5,y=y5,mode='markers',name='peaks ' + activity_2),row=2,col=1)
fig_4.add_trace(go.Scatter(x=x6,y=y6,mode='markers',name='valleys ' + activity_2),row=2,col=1)
fig_4.add_trace(go.Scatter(x=x7,y=y7,name=activity_3),row=3,col=1)
fig_4.add_trace(go.Scatter(x=x8,y=y8,mode='markers',name='peaks ' + activity_3),row=3,col=1)
fig_4.add_trace(go.Scatter(x=x9,y=y9,mode='markers',name='valleys ' + activity_3),row=3,col=1)

fig_4.update_layout(title='Figure 4.4: COVID and port trends in ' + country_1)
fig_4.update_xaxes(title_text="Date")


fig_4.show()

As shown in Figure 4.4 above, COVID cases did impact the number of port calls at the beginning of the pandemic and even in 2022, when there were still 20k fewer port calls in the USA than before the pandemic. Even with fewer port calls, the median time in port has been increasing almost consistently since the beginning of the pandemic, and port stays now take almost 0.5 days longer than before the pandemic. This points to structural dysfunctions that were accentuated when the COVID pandemic struck, notably the inability to automate port and hinterland activities so that vessels could still be handled quickly once they arrive. As the work force operating in the port (including truckers and train operators taking the goods to the hinterland) became sick and had to quarantine, there was no contingency plan to continue port operations. In addition, the graphical trend of new COVID cases closely matches that of the number of port calls, suggesting that demand for goods (which can loosely be inferred from the number of port calls) is closely linked to COVID cases. This will be further discussed in the conclusion.

CONCLUSION¶

In summary, the port performance index, the Spearman's correlation calculations, and the peaks/valleys calculations for the USA indicate that COVID cases did and still do have an impact on port operations. This holds to a greater or lesser degree for many countries, as discussed in sub-questions 2 and 3. Some countries, such as Japan, fared better: Japan kept its ‘median time in port’ relatively stable throughout the last 2 years of the COVID pandemic, although its port operations were still associated with the COVID cases in the country.

By looking at different vessel factors (average size and age) and different port factors (port index, median time in port, and the number of port calls), we can conclusively say that COVID did impact cargo vessel operations but that it varied greatly based on the individual country. We can infer that different port capabilities

Our Streamlit visualization provides users with customizable views of the data they wish to see, whether country comparisons or specific vessel parameters (such as the average size of vessel, or a vessel type such as liquid bulk carriers). It is a plug-and-play feature that allows the reader to explore the data further and carry out additional analyses.

To view the Streamlit visualization in a web page, follow these steps:

  1. Open cmd.

  2. Use the 'cd' command to go to the file path where this file is located.

  3. Type: streamlit run port1.1.py